2 research outputs found

    Virtuoso: High Resource Utilization and μs-scale Performance Isolation in a Shared Virtual Machine TCP Network Stack

    Virtualization improves resource efficiency and ensures security and performance isolation for cloud applications. To that end, operators today use a layered architecture that runs a separate network stack instance in each VM and container connected to a separate virtual switch. Decoupling through layering reduces complexity, but induces performance and resource overheads that are at odds with increasing demands for network bandwidth, communication requirements for large distributed applications, and low latency. We present Virtuoso, a new software networking stack for VMs and containers. Virtuoso performs a fundamental re-organization of the networking stack to maximize CPU utilization, enforce isolation, and minimize networking stack overheads. We maximize utilization by running one elastically shared network stack instance on dedicated cores; we enforce isolation by performing central and fine-grained per-packet resource accounting and scheduling; we reduce overheads by building a single-layer data path with a one-shot fast-path incorporating all processing from the TCP transport layer through network virtualization and virtual switching. Virtuoso improves resource utilization by up to 50% and latencies by up to 42% compared to other virtualized network stacks without sacrificing isolation, and keeps processing overhead within 11.5% of unvirtualized network stacks.
    Comment: Under submission for conference peer review

    Large scale federated analytics and differential privacy budget preservation

    This thesis presents two contributions. The first contribution deals with the problem of siloed data collection and prohibitive data acquisition costs. These costs limit the size and diversity of datasets used in health research. Access to larger and more diverse datasets improves the understanding of disease heterogeneity and facilitates inference of relationships between surgical and pathological findings and symptomatic indicators and outcomes. Unfortunately, freely enabling access to these datasets risks leaking private information, such as medical records, even when the datasets have been stripped of personally identifiable information. In the first part of this thesis, we present LEAP, a data analytics platform with support for federated learning. LEAP allows users to analyze data distributed across multiple institutions in a private and secure manner, without leaking sensitive patient information. LEAP achieves this through an infrastructure that maintains privacy by design and brings the computation to the data, instead of bringing the data to the computation. When training ResNet-18 with 15 participating sites, LEAP adds an overhead of up to 2.5x compared to a centralized model. Despite this overhead, LEAP achieves convergence of the model’s accuracy within 20% of the time taken for the centralized model to converge. One of the techniques used by LEAP to preserve the privacy of sensitive queries is differential privacy (DP). Successive DP queries to a dataset deplete the privacy budget. When the privacy budget is depleted, data curators must block access to the underlying dataset to prevent private information from leaking. In the second part of this thesis, we present a system called the SmartCache. The SmartCache optimizes the use of the privacy budget by interpolating old query results to help answer new queries using a synthetic dataset. Queries answered from the synthetic dataset have a smaller privacy cost, so more queries can be answered before the budget runs out. For statistical queries, the SmartCache saved 30%-50% of the budget for threshold values of 0.99 and 0.999, and for gradient queries it consumed 70% less of the privacy budget when training a fully connected model.
    Science, Faculty of; Computer Science, Department of; Graduate